Written
Corpus,
Language Type:
Bilingual
Languages:
English Japanese
Availability:
From Owner
License:
MIT
Size:
38,062 tokens
Production Status:
Newly created-finished
Use:
Information Extraction, Information Retrieval
- Paper title: English Recipe Flow Graph Corpus
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yoko Yamakata | English recipe flow graph corpus | /N |
Documentation:
Publicly available in English and Japanese
Written
Lexicon,
Language Type:
Bilingual
Languages:
English French
Availability:
Freely Available
License:
Size:
492 entries
Production Status:
Newly created-finished
Use:
Word Sense Disambiguation
- Paper title: Dataset for Temporal Analysis of English-French Cognates
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Antoine Doucet | List of English-French Cognates | /N |
Documentation:
None
Written
Evaluation Data,
Language Type:
Bilingual
Languages:
English Japanese
Availability:
Freely Available
License:
Research Purpose Only
Size:
1000 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: A Test Set for Discourse Translation from Japanese to English
- Paper track: Evaluation/oral presentation
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Masaaki Nagata | Japanese to English Discourse Translation Test Set | /N |
Documentation:
None
Written
Corpus,
Language Type:
Bilingual
Languages:
English French
Availability:
Metadata freely available and full texts available only for the French higher education and research community
License:
Open Licence Etalab for metadata and publisher type ISTEX licence for full texts
Size:
1161 KByte
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: An Experiment in Annotating Animal Species Names from ISTEX Resources
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sabine Barreaux | Animalia 100 | /N |
Documentation:
None
Written
Treebank,
Language Type:
Bilingual
Languages:
Czech English
Availability:
Freely Available
License:
CreativeCommons
Size:
50000 sentences
Production Status:
Existing-updated
Use:
- Paper title: Prague Dependency Treebank - Consolidated 1.0
- Paper track: Written/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marie Mikulová | Prague Czech English Dependency Treebank 2.0 | /N |
Documentation:
http://ufal.mff.cuni.cz/pcedt2.0
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Size:
Transcribed 3179 sentences; with data augmentation up to 18,570 sentences
Production Status:
Newly created-finished
Use:
Dialogue
- Paper title: Augmenting Small Data to Classify Contextualized Dialogue Acts for Exploratory Visualization
- Paper track: Multimodality/poster presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Abhinav Kumar | CHICAGO-CRIME-VIS | /N |
Documentation:
Several paper publications: (1) Kumar, Abhinav, et al. "Towards a dialogue system that supports rich visualizations of data." Proceedings of SIGDIAL 2016; (2) Kumar, Abhinav, et al. "Towards Multimodal Coreference Resolution for Exploratory Data Visualization Dialogue: Context-Based Annotation and Gesture Identification." SEMDIAL 2017. The corpus transcriptions are in English. Not currently publicly available.
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Cree English Indonesian Japanese Nungon Sesotho
Availability:
Freely Available
License:
CreativeCommons
Size:
1.1 GByte
Production Status:
Newly created-finished
Use:
Acquisition
- Paper title: The ACQDIV Corpus Database and Aggregation Pipeline
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Steven Moran | ACQDIV database (public) | /N |
Documentation:
https://github.com/acqdiv/acqdiv
Written
Lexicon,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Size:
300 entries
Production Status:
Newly created-finished
Use:
Lexicon Creation/Annotation
- Paper title: Semi-supervised Deep Embedded Clustering with Anomaly Detection for Semantic Frame Induction
- Paper track: Evaluation/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Zheng Xin Yong | Anomalous Lexical Units | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Bilingual
Languages:
English Vietnamese
Availability:
From Owner
License:
Obtaining one
Size:
~90,000 tokens
Production Status:
Newly created-in progress
Use:
Language Identification
- Paper title: CanVEC - the Canberra Vietnamese-English Code-switching Natural Speech Corpus
- Paper track: Speech/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Li Nguyen | CanVEC | /N |
Documentation:
None
Written
Lexicon,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution 4.0 International
Size:
2521 words
Production Status:
Newly created-finished
Use:
Opinion Mining/Sentiment Analysis
- Paper title: Enhancing a Lexicon of Polarity Shifters through the Supervised Classification of Shifting Directions
- Paper track: Written/poster presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marc Schulder | Lexicon of Polarity Shifting Directions | /N |
Documentation:
English readme file is part of dataset.




